YLAB@RU at Spoken Term Detection Task in NTCIR-9

Authors

  • Yoichi Yamashita
  • Toru Matsunaga
  • Kook Cho
Abstract

Information retrieval based on speech recognition is an important technique for easy access to large amounts of multimedia content that includes speech. Spoken term detection (STD) techniques, which detect a given word or phrase in spoken documents, are being widely developed. This paper proposes a new STD method based on vector quantization (VQ). Spoken documents are represented as sequences of VQ codes and matched against the text query using the V-P score, which measures the relationship between a VQ code and a phoneme. The VQ code representation is an intermediate form between acoustic features, such as MFCC parameters, and the sub-word symbols often used in conventional STD methods. Speaker-dependent VQ avoids the dependency of the acoustic features on the speaker.
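The abstract does not define the V-P score or the matching procedure, so the following is only a rough sketch of the general idea under stated assumptions. It assumes a per-speaker codebook is already available, that the V-P score is a lookup table `vp_score[code, phoneme]` with positive values for good code/phoneme pairs and negative values for poor ones, and that matching is a simple dynamic-programming alignment. The function names `quantize` and `detect` are hypothetical and are not taken from the paper.

```python
import numpy as np


def quantize(features, codebook):
    """Map each feature frame to the index of its nearest codeword.

    features: (n_frames, dim) array, e.g. MFCC vectors.
    codebook: (n_codes, dim) array; assumed to be trained per speaker.
    """
    # Euclidean distance from every frame to every codeword.
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)


def detect(codes, query, vp_score):
    """Find the best-scoring span of `codes` for a phoneme sequence `query`.

    vp_score[c, p] is an ASSUMED affinity table between VQ code c and
    phoneme p (positive = good match). Phonemes are consumed in order and
    each must cover at least one frame. Returns (best_score, end_frame).
    """
    m = len(query)
    prev = np.full(m, -np.inf)  # best score ending in phoneme j at the previous frame
    best, best_end = -np.inf, -1
    for t, code in enumerate(codes):
        local = vp_score[code, query]  # score of this frame against each query phoneme
        stay = prev  # remain in the same phoneme
        # Advance from the previous phoneme; phoneme 0 may start fresh with score 0.
        advance = np.concatenate(([0.0], prev[:-1]))
        cur = local + np.maximum(stay, advance)
        if cur[-1] > best:  # the whole query has been consumed at frame t
            best, best_end = float(cur[-1]), t
        prev = cur
    return best, best_end
```

A detection decision would then compare the returned score with a threshold. How that score is normalized and thresholded is not stated in the abstract and is left open here.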

Similar resources

YLAB@RU at Spoken Term Detection Task in NTCIR-10 SpokenDoc-2

Spoken term detection (STD) techniques, which detect a given word or phrase in spoken documents, are being widely developed to provide easy access to large amounts of multimedia content that includes speech. This paper describes improvements to the STD method based on vector quantization (VQ) that was proposed in NTCIR-9 SpokenDoc. Spoken documents are represen...

Spoken Document Retrieval Experiments for SpokenDoc at Ryukoku University (RYSDT)

In this paper, we describe the spoken document retrieval systems of Ryukoku University, which participated in the NTCIR-9 IR for Spoken Documents (“SpokenDoc”) task. The NTCIR-9 “SpokenDoc” task has two subtasks: the “Spoken term detection (STD) subtask” and the “Spoken document retrieval (SDR) subtask”. We participated in both subtasks as team RYSDT. In this paper, first, our STD systems are de...

Overview of the IR for Spoken Documents Task in NTCIR-9 Workshop

This paper gives an overview of the IR for Spoken Documents Task in the NTCIR-9 Workshop. The task comprises a spoken term detection (STD) subtask and an ad-hoc spoken document retrieval (SDR) subtask. Both subtasks search for terms, passages, and documents in the academic and simulated lectures of the Corpus of Spontaneous Japanese. Finally, seven and five teams partic...

Spoken Document Retrieval Experiments for SpokenDoc-2 at Ryukoku University (RYSDT)

In this paper, we describe the spoken document retrieval systems of Ryukoku University, which participated in the NTCIR-10 IR for Spoken Documents (“SpokenDoc-2”) task. The NTCIR-10 “SpokenDoc-2” task has two subtasks: the “spoken term detection (STD) subtask” and the “ad-hoc spoken content retrieval (SCR) subtask”. We participated in the SCR subtask as team RYSDT. In this paper, our SCR systems are...

Spoken Document Retrieval Experiments for SpokenQuery&Doc at Ryukoku University (RYSDT)

In this paper, we describe the spoken document retrieval (SDR) systems of Ryukoku University, which participated in the NTCIR-11 “SpokenQuery&Doc” task. The NTCIR-11 SpokenQuery&Doc task has two subtasks: the “spoken content retrieval (SCR) subtask” and the “spoken term detection (STD) subtask”. We participated in the SCR and STD subtasks as team RYSDT. In this paper, our SDR and STD systems are described.

Publication date: 2011